Smart Paper Notes

Causal Modeling of Soil Processes for Improved Generalization

Somya Sharma, Swati Sharma, Andy Neal, Sara Malvar, Eduardo Rodrigues, John Crawford, Emre Kiciman, Ranveer Chandra

Category: Machine Learning

2022-11-10

Measuring and monitoring soil organic carbon is critical for agricultural productivity and for addressing pressing environmental problems. Soil organic carbon not only enriches soil nutrition, but also brings a range of co-benefits such as improving water storage and limiting physical erosion. Despite extensive work on soil organic carbon estimation, current approaches do not generalize well across soil conditions and management practices. We empirically show that explicitly modeling cause-and-effect relationships among soil processes improves the out-of-distribution generalizability of prediction models. We provide a comparative analysis of soil organic carbon estimation models in which the skeleton is estimated using causal discovery methods. Our framework provides an average improvement of 81% in test mean squared error and 52% in test mean absolute error.

translated by Google Translate

Approximate Bayesian Computation for Physical Inverse Modeling

Neel Chatterjee, Somya Sharma, Sarah Swisher, Snigdhansu Chatterjee

Categories: Machine Learning | Machine Learning (Statistics)

2021-11-26

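Semiconductor device models are essential for understanding charge transport in thin-film transistors (TFTs). Drawing inferences from these TFT models involves estimating the parameters used to fit experimental data, which may consist of extracted charge-carrier mobility or measured current. Estimating these parameters helps us draw inferences about device performance. Fitting a TFT model to given experimental data relies on human experts manually fine-tuning multiple parameters. Several of these parameters may have confounding effects on the experimental data, which makes extracting their individual effects a non-intuitive process during manual tuning. To avoid this convoluted process, we propose a new method that automates model parameter extraction and yields an accurate model fit. In this work, model-choice-based approximate Bayesian computation (ABC) is used to generate the posterior distribution of the estimated parameters from mobility observed at various gate voltages. Furthermore, we show that the extracted parameters can be accurately predicted from the mobility curves using gradient-boosted trees. This work also provides a comparative analysis of the proposed framework against fine-tuned neural networks, in which the proposed framework is shown to perform better.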
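As a rough illustration of the ABC idea described above, the following is a minimal rejection-ABC sketch in Python. It is not the authors' method: the abstract describes model-choice-based ABC on a TFT model, whereas this sketch uses plain rejection sampling on a hypothetical power-law mobility model. The function names, the uniform priors, the tolerance, and the synthetic "observed" data are all assumptions made for illustration.

```python
import random


def simulate_mobility(mu0, gamma, gate_voltages):
    # Hypothetical toy model (an assumption, not the TFT model from the paper):
    # mobility grows as a power law of the gate voltage.
    return [mu0 * v ** gamma for v in gate_voltages]


def distance(simulated, observed):
    # Root-mean-square error between a simulated and an observed mobility curve.
    n = len(observed)
    return (sum((s - o) ** 2 for s, o in zip(simulated, observed)) / n) ** 0.5


def abc_rejection(observed, gate_voltages, n_draws=20000, tolerance=0.5, seed=0):
    # Plain rejection ABC: draw parameters from the prior, simulate,
    # and keep only the draws whose simulated curve is close to the data.
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        mu0 = rng.uniform(0.1, 5.0)    # assumed uniform prior on mu0
        gamma = rng.uniform(0.0, 1.0)  # assumed uniform prior on gamma
        simulated = simulate_mobility(mu0, gamma, gate_voltages)
        if distance(simulated, observed) < tolerance:
            accepted.append((mu0, gamma))
    return accepted


# Synthetic "observed" curve generated from known parameters (mu0=2.0, gamma=0.5).
gate_voltages = [1.0, 2.0, 4.0, 8.0]
observed = simulate_mobility(2.0, 0.5, gate_voltages)

posterior = abc_rejection(observed, gate_voltages)
mu0_mean = sum(p[0] for p in posterior) / len(posterior)
gamma_mean = sum(p[1] for p in posterior) / len(posterior)
print(len(posterior), mu0_mean, gamma_mean)
```

The accepted draws approximate the posterior over the two parameters; tightening the tolerance sharpens the approximation at the cost of fewer accepted samples.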

Tsetlin Machine Embedding: Representing Words Using Logical Expressions

Bimal Bhattarai, Ole-Christoffer Granmo, Lei Jiao, Rohan Yadav, Jivitesh Sharma

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2023-01-02

Embedding words in vector space is a fundamental first step in state-of-the-art natural language processing (NLP). Typical NLP solutions employ pre-defined vector representations to improve generalization by co-locating similar words in vector space. For instance, Word2Vec is a self-supervised predictive model that captures the context of words using a neural network. Similarly, GloVe is a popular unsupervised model incorporating corpus-wide word co-occurrence statistics. Such word embeddings have significantly boosted important NLP tasks, including sentiment analysis, document classification, and machine translation. However, the embeddings are dense floating-point vectors, making them expensive to compute and difficult to interpret. In this paper, we instead propose to represent the semantics of words with a few defining words that are related using propositional logic. To produce such logical embeddings, we introduce a Tsetlin Machine-based autoencoder that learns logical clauses in a self-supervised manner. The clauses consist of contextual words like "black," "cup," and "hot" to define other words like "coffee," and are thus human-understandable. We evaluate our embedding approach on several intrinsic and extrinsic benchmarks, outperforming GloVe on six classification tasks. Furthermore, we investigate the interpretability of our embedding using the logical representations acquired during training. We also visualize word clusters in vector space, demonstrating how our logical embedding co-locates similar words.
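To make the idea of a logical word representation more concrete, here is a small Python sketch of how conjunctive clauses over context words could be evaluated. The clauses below are written by hand for illustration; in the paper they are learned by a Tsetlin Machine-based autoencoder, which this sketch does not implement. The dictionary layout and the names `clauses_for`, `clause_matches`, and `score` are assumptions.

```python
# Each clause is a conjunction of literals: every word in "include" must be
# present in the context, and no word in "exclude" may be present.
# These clauses are hand-written (hypothetical), not learned.
clauses_for = {
    "coffee": [
        {"include": {"black", "cup"}, "exclude": set()},
        {"include": {"hot", "cup"}, "exclude": {"tea"}},
    ],
}


def clause_matches(clause, context):
    # True when all included words appear and no excluded word appears.
    has_all = clause["include"] <= context
    has_none = not (clause["exclude"] & context)
    return has_all and has_none


def score(word, context):
    # Number of the word's clauses that fire on the given set of context words.
    return sum(clause_matches(c, context) for c in clauses_for[word])


print(score("coffee", {"black", "cup", "hot"}))  # both clauses fire
print(score("coffee", {"hot", "cup", "tea"}))    # no clause fires
```

Because each clause is just a readable list of required and forbidden words, a person can inspect why a given context supports a word, which is the interpretability property the abstract emphasizes.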

A Comparative Study of Image Disguising Methods for Confidential Outsourced Learning

Sagar Sharma, Yuechun Gu, Keke Chen

Category: Machine Learning

2022-12-31

Large training data and expensive model tweaking are standard features of deep learning for images. As a result, data owners often use cloud resources to develop large-scale complex models, which raises privacy concerns. Existing solutions are either too expensive to be practical or do not sufficiently protect the confidentiality of data and models. In this paper, we study and compare novel image disguising mechanisms, DisguisedNets and InstaHide, aiming to achieve a better trade-off among the level of protection for outsourced DNN model training, the expenses, and the utility of data. DisguisedNets are novel combinations of image blocktization, block-level random permutation, and two block-level secure transformations: random multidimensional projection (RMT) and AES pixel-level encryption (AES). InstaHide is an image mixup and random pixel flipping technique (Huang et al., 2020). We have analyzed and evaluated them under a multi-level threat model. RMT provides a better security guarantee than InstaHide under Level-1 adversarial knowledge, with well-preserved model quality. In contrast, AES provides a security guarantee under Level-2 adversarial knowledge, but it may affect model quality more. The unique features of image disguising also help us protect models from model-targeted attacks. We have conducted an extensive experimental evaluation to understand how these methods work in different settings and on different datasets.
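The block-level steps mentioned above (splitting an image into blocks and randomly permuting them) can be sketched in a few lines of Python. This covers only blocktization and permutation; the RMT and AES transformations are omitted, and nothing here reflects the authors' actual implementation. The function names and the integer `key` are hypothetical, the image dimensions are assumed to be divisible by the block size, and `random.Random` is used only for illustration since it is not cryptographically secure.

```python
import random


def split_blocks(image, block):
    # Cut a 2-D image (list of rows) into block x block tiles, row-major order.
    h, w = len(image), len(image[0])
    blocks = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            blocks.append([row[c:c + block] for row in image[r:r + block]])
    return blocks


def join_blocks(blocks, h, w, block):
    # Reassemble tiles (row-major order) into an h x w image.
    image = [[0] * w for _ in range(h)]
    per_row = w // block
    for i, tile in enumerate(blocks):
        r0, c0 = (i // per_row) * block, (i % per_row) * block
        for dr, row in enumerate(tile):
            image[r0 + dr][c0:c0 + block] = row
    return image


def disguise(image, block, key):
    # Permute the tiles with a keyed shuffle (illustrative, not secure).
    h, w = len(image), len(image[0])
    blocks = split_blocks(image, block)
    order = list(range(len(blocks)))
    random.Random(key).shuffle(order)
    return join_blocks([blocks[i] for i in order], h, w, block), order


def recover(disguised, block, order):
    # Invert the permutation: the tile at position pos came from index order[pos].
    h, w = len(disguised), len(disguised[0])
    blocks = split_blocks(disguised, block)
    original = [None] * len(blocks)
    for pos, src in enumerate(order):
        original[src] = blocks[pos]
    return join_blocks(original, h, w, block)


image = [[r * 4 + c for c in range(4)] for r in range(4)]
disguised, order = disguise(image, block=2, key=42)
restored = recover(disguised, block=2, order=order)
print(restored == image)
```

Permutation alone keeps every pixel value intact, which is why the schemes in the abstract pair it with a per-block transformation.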

Quantum-Inspired Tensor Neural Networks for Option Pricing

Raj G. Patel, Chia-Wei Hsing, Serkan Sahin, Samuel Palmer, Saeed S. Jahromi, Shivam Sharma, Tomas Dominguez, Kris Tziritas, Christophe Michel, Vincent Porte

Category: Machine Learning

2022-12-28

Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of these approaches has led to solving high-dimensional PDEs, which opens the door to a variety of real-world problems ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. Tackling these shortcomings, we demonstrate that Tensor Neural Networks (TNN) can provide significant parameter savings while attaining the same accuracy as the classical Dense Neural Network (DNN). In addition, we show how TNN can be trained faster than DNN for the same accuracy. Besides TNN, we also introduce Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance for an equivalent parameter count as compared to a DNN. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.

A System-Level View on Out-of-Distribution Data in Robotics

Rohan Sinha, Apoorva Sharma, Somrita Banerjee, Thomas Lew, Rachel Luo, Spencer M. Richards, Yixiao Sun, Edward Schmerling, Marco Pavone

Categories: Robotics | Machine Learning

2022-12-28

When testing conditions differ from those represented in training data, so-called out-of-distribution (OOD) inputs can mar the reliability of black-box learned components in the modern robot autonomy stack. Therefore, coping with OOD data is an important challenge on the path towards trustworthy learning-enabled open-world autonomy. In this paper, we aim to demystify the topic of OOD data and its associated challenges in the context of data-driven robotic systems, drawing connections to emerging paradigms in the ML community that study the effect of OOD data on learned models in isolation. We argue that as roboticists, we should reason about the overall system-level competence of a robot as it performs tasks in OOD conditions. We highlight key research questions around this system-level view of OOD problems to guide future research toward safe and reliable learning-enabled autonomy.

On the Equivalence of the Weighted Tsetlin Machine and the Perceptron

Jivitesh Sharma, Ole-Christoffer Granmo, Lei Jiao

Category: Machine Learning

2022-12-27

The Tsetlin Machine (TM) has been gaining popularity as an inherently interpretable machine learning method that achieves promising performance with low computational complexity on a variety of applications. The interpretability and the low computational complexity of the TM are inherited from the Boolean expressions it uses to represent various sub-patterns. Despite these favorable properties, the TM has not been the go-to method for AI applications, mainly due to its conceptual and theoretical differences compared with perceptrons and neural networks, which are more widely known and well understood. In this paper, we provide detailed insights into the operational concept of the TM and try to bridge the gap in theoretical understanding between the perceptron and the TM. More specifically, we study the operational concept of the TM following the analytical structure of perceptrons, showing the resemblance between perceptrons and the TM. Through this analysis, we show that the TM's weight update can be considered a special case of the gradient weight update. We also perform an empirical analysis of the TM, showing the flexibility in determining the clause length, visualizing decision boundaries, and obtaining interpretable Boolean expressions from the TM. In addition, we discuss the advantages of the TM in terms of its structure and its ability to solve more complex problems.

SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks

Suwon Shon, Siddhant Arora, Chyi-Jiunn Lin, Ankita Pasad, Felix Wu, Roshan Sharma, Wei-Lun Wu, Hung-Yi Lee, Karen Livescu, Shinji Watanabe

Category: Natural Language Processing

2022-12-20

Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community, but have not received as much attention as lower-level tasks like speech and speaker recognition. In particular, there are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers. Recent work has begun to introduce such benchmark datasets for several tasks. In this work, we introduce several new annotated SLU benchmark tasks based on freely available speech data, which complement existing benchmarks and address gaps in the SLU evaluation landscape. We contribute four tasks: question answering and summarization involve inference over longer speech sequences; named entity localization addresses the speech-specific task of locating the targeted content in the signal; dialog act classification identifies the function of a given speech utterance. We follow the blueprint of the Spoken Language Understanding Evaluation (SLUE) benchmark suite. In order to facilitate the development of SLU models that leverage the success of pre-trained speech representations, we will be publishing for each task (i) annotations for a relatively small fine-tuning set, (ii) annotated development and test sets, and (iii) baseline models for easy reproducibility and comparisons. In this work, we present the details of data collection and annotation and the performance of the baseline models. We also perform sensitivity analysis of pipeline models' performance (speech recognizer + text model) to the speech recognition accuracy, using more than 20 state-of-the-art speech recognition models.

A Comparison Between Tsetlin Machines and Deep Neural Networks in the Context of Recommendation Systems

Karl Audun Borgersen, Morten Goodwin, Jivitesh Sharma

Category: Artificial Intelligence

2022-12-20

Recommendation Systems (RSs) are ubiquitous in modern society and are one of the largest points of interaction between humans and AI. Modern RSs are often implemented using deep learning models, which are infamously difficult to interpret. This problem is particularly exacerbated in recommendation scenarios, as it erodes the user's trust in the RS. In contrast, the newly introduced Tsetlin Machines (TMs) possess some valuable properties due to their inherent interpretability. TMs are still fairly young as a technology. As no TM-based RS has been developed before, preliminary research on the practicality of such a system is needed. In this paper, we develop the first RS based on TMs to evaluate its practicality in this application domain. This paper compares the viability of TMs with other machine learning models prevalent in the field of RS. We train and investigate the performance of the TM compared with a vanilla feed-forward deep learning model. These comparisons are based on model performance, interpretability/explainability, and scalability. Further, we provide some benchmark performance comparisons to similar machine learning solutions relevant to RSs.

AdverSAR: Adversarial Search and Rescue via Multi-Agent Reinforcement Learning

Aowabin Rahman, Arnab Bhattacharya, Thiagarajan Ramachandran, Sayak Mukherjee, Himanshu Sharma, Ted Fujimoto, Samrat Chatterjee

Categories: Robotics | Machine Learning

2022-12-20

Search and Rescue (SAR) missions in remote environments often employ autonomous multi-robot systems that learn, plan, and execute a combination of local single-robot control actions, group primitives, and global mission-oriented coordination and collaboration. Often, SAR coordination strategies are manually designed by human experts who can remotely control the multi-robot system and enable semi-autonomous operations. However, in remote environments where connectivity is limited and human intervention is often not possible, decentralized collaboration strategies are needed for fully-autonomous operations. Nevertheless, decentralized coordination may be ineffective in adversarial environments due to sensor noise, actuation faults, or manipulation of inter-agent communication data. In this paper, we propose an algorithmic approach based on adversarial multi-agent reinforcement learning (MARL) that allows robots to efficiently coordinate their strategies in the presence of adversarial inter-agent communications. In our setup, the objective of the multi-robot team is to discover targets strategically in an obstacle-strewn geographical area by minimizing the average time needed to find the targets. It is assumed that the robots have no prior knowledge of the target locations, and they can interact with only a subset of neighboring robots at any time. Based on the centralized training with decentralized execution (CTDE) paradigm in MARL, we utilize a hierarchical meta-learning framework to learn dynamic team-coordination modalities and discover emergent team behavior under complex cooperative-competitive scenarios. The effectiveness of our approach is demonstrated on a collection of prototype grid-world environments with different specifications of benign and adversarial agents, target locations, and agent rewards.
